50 research outputs found
Trust-based algorithms for fusing crowdsourced estimates of continuous quantities
Crowdsourcing has provided a viable way of gathering information at unprecedented volumes and speed by engaging individuals to perform simple micro–tasks. In particular, the crowdsourcing paradigm has been successfully applied to participatory sensing, in which the users perform sensing tasks and provide data using their mobile devices. In this way, people can help solve complex environmental sensing tasks, such as weather monitoring, nuclear radiation monitoring and cell tower mapping, in a highly decentralised and parallelised fashion. Traditionally, crowdsourcing technologies were primarily used for gathering data for classifications and image labelling tasks. In contrast, such crowd–based participatory sensing poses new challenges that relate to (i) dealing with human–reported sensor data that are available in the form of continuous estimates of an observed quantity such as a location, a temperature or a sound reading, (ii) dealing with possible spatial and temporal correlations within the data and (ii) issues of data trustworthiness due to the unknown capabilities and incentives of the participants and their devices. Solutions to these challenges need to be able to combine the data provided by multiple users to ensure the accuracy and the validity of the aggregated results. With this in mind, our goal is to provide methods to better aid the aggregation process of crowd–reported sensor estimates of continuous quantities when data are provided by individuals of varying trustworthiness. To achieve this, we develop a trust–based in- formation fusion framework that incorporates latent trustworthiness traits of the users within the data fusion process. Through this framework, we develop a set of four novel algorithms (MaxTrust, BACE, TrustGP and TrustLGCP) to compute reliable aggregations of the users’ reports in both the settings of observing a stationary quantity (Max- Trust and BACE) and a spatially distributed phenomenon (TrustGP and TrustLGCP). The key feature of all these algorithm is the ability of (i) learning the trustworthiness of each individual who provide the data and (ii) exploit this latent user’s trustworthiness information to compute a more accurate fused estimate. In particular, this is achieved by using a probabilistic framework that allows our methods to simultaneously learn the fused estimate and the users’ trustworthiness from the crowd reports. We validate our algorithms in four key application areas (cell tower mapping, WiFi networks mapping, nuclear radiation monitoring and disaster response) that demonstrate the practical impact of our framework to achieve substantially more accurate and informative predictions compared to the existing fusion methods. We expect that results of this thesis will allow to build more reliable data fusion algorithms for the broad class of human–centred information systems (e.g., recommendation systems, peer reviewing systems, student grading tools) that are based on making decisions upon subjective opinions provided by their users
Trust-Based Fusion of Untrustworthy Information in Crowdsourcing Applications
In this paper, we address the problem of fusing untrustworthy reports provided from a crowd of observers, while simultaneously learning the trustworthiness of individuals. To achieve this, we construct a likelihood model of the userss trustworthiness by scaling the uncertainty of its multiple estimates with trustworthiness parameters. We incorporate our trust model into a fusion method that merges estimates based on the trust parameters and we provide an inference algorithm that jointly computes the fused output and the individual trustworthiness of the users based on the maximum likelihood framework. We apply our algorithm to cell tower localisation using real-world data from the OpenSignal project and we show that it outperforms the state-of-the-art methods in both accuracy, by up to 21%, and consistency, by up to 50% of its predictions. Copyright © 2013, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved
Time-Sensitive Bayesian Information Aggregation for Crowdsourcing Systems
Crowdsourcing systems commonly face the problem of aggregating multiple
judgments provided by potentially unreliable workers. In addition, several
aspects of the design of efficient crowdsourcing processes, such as defining
worker's bonuses, fair prices and time limits of the tasks, involve knowledge
of the likely duration of the task at hand. Bringing this together, in this
work we introduce a new time--sensitive Bayesian aggregation method that
simultaneously estimates a task's duration and obtains reliable aggregations of
crowdsourced judgments. Our method, called BCCTime, builds on the key insight
that the time taken by a worker to perform a task is an important indicator of
the likely quality of the produced judgment. To capture this, BCCTime uses
latent variables to represent the uncertainty about the workers' completion
time, the tasks' duration and the workers' accuracy. To relate the quality of a
judgment to the time a worker spends on a task, our model assumes that each
task is completed within a latent time window within which all workers with a
propensity to genuinely attempt the labelling task (i.e., no spammers) are
expected to submit their judgments. In contrast, workers with a lower
propensity to valid labeling, such as spammers, bots or lazy labelers, are
assumed to perform tasks considerably faster or slower than the time required
by normal workers. Specifically, we use efficient message-passing Bayesian
inference to learn approximate posterior probabilities of (i) the confusion
matrix of each worker, (ii) the propensity to valid labeling of each worker,
(iii) the unbiased duration of each task and (iv) the true label of each task.
Using two real-world public datasets for entity linking tasks, we show that
BCCTime produces up to 11% more accurate classifications and up to 100% more
informative estimates of a task's duration compared to state-of-the-art
methods
From Manifesta to Krypta: The Relevance of Categories for Trusting Others
In this paper we consider the special abilities needed by agents for assessing trust based on inference and reasoning. We analyze the case in which it is possible to infer trust towards unknown counterparts by reasoning on abstract classes or categories of agents shaped in a concrete application domain. We present a scenario of interacting agents providing a computational model implementing different strategies to assess trust. Assuming a medical domain, categories, including both competencies and dispositions of possible trustees, are exploited to infer trust towards possibly unknown counterparts. The proposed approach for the cognitive assessment of trust relies on agents' abilities to analyze heterogeneous information sources along different dimensions. Trust is inferred based on specific observable properties (Manifesta), namely explicitly readable signals indicating internal features (Krypta) regulating agents' behavior and effectiveness on specific tasks. Simulative experiments evaluate the performance of trusting agents adopting different strategies to delegate tasks to possibly unknown trustees, while experimental results show the relevance of this kind of cognitive ability in the case of open Multi Agent Systems
Facing Openness with Socio Cognitive Trust and Categories.
Typical solutions for agents assessing trust relies on the circulation of information on the individual level, i.e. reputational images, subjective experiences, statistical analysis, etc. This work presents an alternative approach, inspired to the cognitive heuristics enabling humans to reason at a categorial level. The approach is envisaged as a crucial ability for agents in order to: (1) estimate trustworthiness of unknown trustees based on an ascribed membership to categories; (2) learn a series of emergent relations between trustees observable properties and their effective abilities to fulfill tasks in situated conditions. On such a basis, categorization is provided to recognize signs (Manifesta) through which hidden capabilities (Kripta) can be inferred. Learning is provided to refine reasoning attitudes needed to ascribe tasks to categories. A series of architectures combining categorization abilities, individual experiences and context awareness are evaluated and compared in simulated experiments
Reasoning with Categories for Trusting Strangers: a Cognitive Architecture
A crucial issue for agents in open systems is the ability to filter out information sources in order to build an image of their counterparts, upon which a subjective evaluation of trust as a promoter of interactions can be assessed. While typical solutions discern relevant information sources by relying on previous experiences or reputational images, this work presents an alternative approach based on the cognitive ability to: (i) analyze heterogeneous information sources along different dimensions; (ii) ascribe qualities to unknown counterparts based on reasoning over abstract classes or categories; and, (iii) learn a series of emergent relationships between particular properties observable on other agents and their effective abilities to fulfill tasks. A computational architecture is presented allowing cognitive agents to dynamically assess trust based on a limited set of observable properties, namely explicitly readable signals (Manifesta) through which it is possible to infer hidden properties and capabilities (Krypta), which finally regulate agents' behavior in concrete work environments. Experimental evaluation discusses the effectiveness of trustor agents adopting different strategies to delegate tasks based on categorization
A Personalised Reader for Crowd Curated Content
Personalised news recommender systems traditionally rely on content ingested from a select set of publishers and ask users to indicate their interests from a predefined list of top- ics. They then provide users a feed of news items for each of their topics. In this demo, we present a mobile app that automatically learns users’ interests from their browsing or twitter history and provides them with a personalised feed of diverse, crowd curated content. The app also continuously learns from the users’ interactions as they swipe to like or skip items recommended to them. In addition, users can discover trending stories and content liked by other users they follow. The crowd is thus formed of the users, who as a whole act as the curators of the content to be recommended
Reply With: Proactive Recommendation of Email Attachments
Email responses often contain items-such as a file or a hyperlink to an
external document-that are attached to or included inline in the body of the
message. Analysis of an enterprise email corpus reveals that 35% of the time
when users include these items as part of their response, the attachable item
is already present in their inbox or sent folder. A modern email client can
proactively retrieve relevant attachable items from the user's past emails
based on the context of the current conversation, and recommend them for
inclusion, to reduce the time and effort involved in composing the response. In
this paper, we propose a weakly supervised learning framework for recommending
attachable items to the user. As email search systems are commonly available,
we constrain the recommendation task to formulating effective search queries
from the context of the conversations. The query is submitted to an existing IR
system to retrieve relevant items for attachment. We also present a novel
strategy for generating labels from an email corpus---without the need for
manual annotations---that can be used to train and evaluate the query
formulation model. In addition, we describe a deep convolutional neural network
that demonstrates satisfactory performance on this query formulation task when
evaluated on the publicly available Avocado dataset and a proprietary dataset
of internal emails obtained through an employee participation program.Comment: CIKM2017. Proceedings of the 26th ACM International Conference on
Information and Knowledge Management. 201
The ActiveCrowdToolkit: an open-source tool for benchmarking active learning algorithms for crowdsourcing research
We present an open-source toolkit that allows the easy comparison of the performance of active learning methods over a series of datasets. The toolkit allows such strategies to be constructed by combining a judgement aggregation model, task selection method and worker selection method. The toolkit also provides a user interface which allows researchers to gain insight into worker performance and task classification at runtime